The interview has long been regarded as one of the most crucial steps in recruitment. To fully prepare for interviews with recruiters, job seekers usually practice mock interviews with each other. However, such mock interviews with peers are generally far from the real interview experience: the mock interviewers are not guaranteed to be professional and are unlikely to behave like real interviewers. Due to the rapid growth of online recruitment in recent years, recruiters tend to conduct interviews online, which makes it possible to collect interview data from real interviewers. In this paper, we propose a novel application named EZInterviewer, which aims to learn from online interview data and provide mock interview services to job seekers. The task is challenging in two ways: (1) the interview data are now available but still low-resource; (2) generating meaningful and relevant interview dialogs requires a thorough understanding of both resumes and job descriptions. To address the low-resource challenge, EZInterviewer is trained on a very small set of interview dialogs. The key idea is to reduce the number of parameters that rely on interview dialogs by disentangling the knowledge selector and the dialog generator, so that most parameters can be trained on ungrounded dialogs as well as resume data, which are not low-resource. Evaluation results on a real-world job interview dialog dataset indicate that we achieve promising results in generating mock interviews. With the help of EZInterviewer, we hope to make mock interview practice easier for job seekers.
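As a rough illustration of the disentangling idea, the sketch below (PyTorch; module names, layer choices, and sizes are hypothetical, not the authors' code) keeps a parameter-light knowledge selector separate from a parameter-heavy generator that could be pre-trained on ungrounded dialogs, so only the selector needs the scarce grounded interview dialogs.

```python
import torch
import torch.nn as nn

class KnowledgeSelector(nn.Module):
    """Small module fit on the scarce grounded interview dialogs: it scores
    resume/job-description snippets against the current dialog context."""
    def __init__(self, dim=256):
        super().__init__()
        self.score = nn.Bilinear(dim, dim, 1)

    def forward(self, context_vec, knowledge_vecs):
        # context_vec: (dim,); knowledge_vecs: (num_snippets, dim)
        logits = self.score(context_vec.expand_as(knowledge_vecs), knowledge_vecs)
        weights = torch.softmax(logits.squeeze(-1), dim=0)
        return (weights.unsqueeze(-1) * knowledge_vecs).sum(dim=0)

class DialogGenerator(nn.Module):
    """Parameter-heavy seq2seq part; its weights can be pre-trained on
    ungrounded (non-interview) dialogs and resume data."""
    def __init__(self, vocab=32000, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, context_ids, knowledge_vec, target_ids):
        _, h = self.encoder(self.embed(context_ids))  # dialog context state
        h = h + knowledge_vec                         # ground generation on selected knowledge
        dec, _ = self.decoder(self.embed(target_ids), h)
        return self.out(dec)                          # next-token logits
```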
Nowadays, time-stamped web documents related to a general news query flood the Internet, and timeline summarization aims to concisely summarize the evolution trajectory of events along the timeline. Unlike traditional document summarization, timeline summarization needs to model the time-series information of the input events and summarize important events in chronological order. To tackle this challenge, in this paper, we propose a Unified Timeline Summarizer (UTS) that can generate abstractive and extractive timeline summaries in time order. Concretely, in the encoder part, we propose a graph-based event encoder that relates multiple events according to their content dependency and learns a global representation of each event. In the decoder part, to ensure the chronological order of the abstractive summary, we propose to extract the event-level attention in its generation process, with the sequential information retained, and use it to simulate the evolutionary attention of the ground-truth summary. The event-level attention can also be used to assist extractive summarization, where the extracted summary likewise comes in time order. We augment the previous Chinese large-scale timeline summarization dataset and collect a new English timeline dataset. Extensive experiments conducted on these datasets and on the out-of-domain Timeline17 dataset show that UTS achieves state-of-the-art performance in terms of both automatic and human evaluations.
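A minimal sketch of what the graph-based event encoder could look like is given below, assuming a content-dependency adjacency matrix (e.g., thresholded similarity between event texts) is available; the actual UTS architecture is more elaborate and is not reproduced here.

```python
import torch
import torch.nn as nn

class EventGraphEncoder(nn.Module):
    """One message-passing layer over the event graph: each event representation
    is refined by aggregating the events it depends on (a simplified stand-in
    for the paper's graph-based event encoder)."""
    def __init__(self, dim=256):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, event_vecs, adj):
        # event_vecs: (num_events, dim) encoded events, in chronological order
        # adj: (num_events, num_events) content-dependency weights (assumed given)
        norm = adj / adj.sum(dim=-1, keepdim=True).clamp(min=1e-6)
        neighbours = norm @ event_vecs               # pull in related events
        return torch.relu(self.proj(event_vecs + neighbours))  # global event representations
```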
In a citation graph, adjacent paper nodes share related scientific terms and topics. The graph thus conveys unique structural information about document-level relatedness that can be utilized in the paper summarization task to explore beyond intra-document information. In this work, we focus on leveraging citation graphs to improve extractive summarization of scientific papers under different settings. We first propose a Multi-granularity Unsupervised Summarization model (MUS) as a simple and low-cost solution to the task. MUS fine-tunes a pre-trained encoder model on the citation graph through link prediction tasks. Then, the abstract sentences are extracted from the corresponding paper considering multi-granularity information. Preliminary results demonstrate that the citation graph is helpful even in a simple unsupervised framework. Motivated by this, we next propose a Graph-based Supervised Summarization model (GSS) to achieve more accurate results on the task when large-scale labeled data are available. Apart from employing link prediction as an auxiliary task, GSS introduces a gated sentence encoder and a graph information fusion module to take advantage of the graph information to polish the sentence representations. Experiments on a public benchmark dataset show that MUS and GSS bring substantial improvements over the prior state-of-the-art model.
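The link-prediction fine-tuning step might look roughly like the following, assuming a SciBERT-style encoder and citation pairs sampled from the graph; the model choice and the positive/negative pair construction are assumptions, not details from the paper.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
enc = AutoModel.from_pretrained("allenai/scibert_scivocab_uncased")
score = nn.Bilinear(768, 768, 1)   # edge scorer for link prediction
opt = torch.optim.AdamW(list(enc.parameters()) + list(score.parameters()), lr=2e-5)

def embed(texts):
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    return enc(**batch).last_hidden_state[:, 0]   # [CLS] embeddings

def link_prediction_step(src_texts, dst_texts, labels):
    """labels[i] = 1 for a real citation edge, 0 for a sampled negative pair."""
    logits = score(embed(src_texts), embed(dst_texts)).squeeze(-1)
    loss = nn.functional.binary_cross_entropy_with_logits(logits, labels.float())
    loss.backward()
    opt.step()
    opt.zero_grad()
    return loss.item()
```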
Dynamic facial expression recognition (DFER) in the wild is an extremely challenging task due to the large number of noisy frames in video sequences. Previous works focus on extracting more discriminative features but neglect distinguishing key frames from noisy frames. To address this issue, we propose a noise-robust dynamic facial expression recognition network (NR-DFERNet), which can effectively reduce the interference of noisy frames on the DFER task. Specifically, in the spatial stage, we design a dynamic-static fusion module (DSF) that introduces dynamic features into static features to learn more discriminative spatial features. To suppress the impact of target-irrelevant frames, we introduce a novel dynamic class token (DCT) for the transformer in the temporal stage. Moreover, we design a snippet-based filter (SF) at the decision stage to reduce the impact of too many neutral frames on the classification of non-neutral sequences. Extensive experimental results demonstrate that our NR-DFERNet outperforms state-of-the-art methods on both the DFEW and AFEW benchmarks.
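A minimal sketch of the dynamic-static fusion idea is shown below: per-frame ("static") features are fused with inter-frame difference ("dynamic") features. The layer choices and shapes are assumptions for illustration, not the released NR-DFERNet implementation.

```python
import torch
import torch.nn as nn

class DynamicStaticFusion(nn.Module):
    """Fuse static per-frame features with dynamic frame-difference features."""
    def __init__(self, channels=64):
        super().__init__()
        self.static_conv = nn.Conv2d(3, channels, 3, padding=1)
        self.dynamic_conv = nn.Conv2d(3, channels, 3, padding=1)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, frames):
        # frames: (B, T, 3, H, W) video clip
        b, t, c, h, w = frames.shape
        static = self.static_conv(frames.reshape(b * t, c, h, w))
        diffs = frames[:, 1:] - frames[:, :-1]                    # frame-to-frame motion
        diffs = torch.cat([torch.zeros_like(frames[:, :1]), diffs], dim=1)
        dynamic = self.dynamic_conv(diffs.reshape(b * t, c, h, w))
        fused = self.fuse(torch.cat([static, dynamic], dim=1))    # discriminative spatial features
        return fused.reshape(b, t, -1, h, w)
```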
Facial micro-expressions (MEs) are involuntary facial movements that reveal people's true feelings and play an important role in the early intervention of mental illness, national security, and many human-computer interaction systems. However, existing micro-expression datasets are limited and usually pose some challenges for training good classifiers. To model subtle facial muscle motions, we propose a robust micro-expression recognition (MER) framework, namely the muscle motion-guided network (MMNet). Specifically, a continuous attention (CA) block is introduced to focus on modeling local subtle muscle motion patterns with little identity information, which is different from most previous methods that directly extract features from complete video frames containing much identity information. Moreover, we design a position calibration (PC) module based on the vision transformer. By adding the position embeddings of the face generated by the PC module at the end of the two branches, the PC module can help add position information to the facial muscle motion patterns for MER. Extensive experiments on three public micro-expression datasets show that our approach outperforms state-of-the-art methods by a large margin.
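The sketch below illustrates the two ingredients described above in a heavily simplified form: motion tokens are built from the apex-onset frame difference (a muscle-motion cue carrying little identity information), and a learned facial position embedding stands in for the position calibration (PC) module. All shapes and layers are assumptions.

```python
import torch
import torch.nn as nn

class MotionPatternEncoder(nn.Module):
    """Encode muscle-motion patterns and calibrate them with position embeddings."""
    def __init__(self, patch=16, channels=3, dim=128, img_size=224):
        super().__init__()
        num_patches = (img_size // patch) ** 2
        self.to_tokens = nn.Conv2d(channels, dim, kernel_size=patch, stride=patch)
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches, dim))  # PC embedding

    def forward(self, onset, apex):
        # onset, apex: (B, 3, H, W) onset and apex frames of a micro-expression clip
        diff = apex - onset                                      # subtle muscle motion
        tokens = self.to_tokens(diff).flatten(2).transpose(1, 2) # (B, N, dim)
        return tokens + self.pos_embed                           # position-calibrated motion tokens
```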
We present a novel backbone architecture that improves the target-perception ability of feature representations. Specifically, it has been observed that de facto frameworks simply use the outputs of the backbone for feature matching and subsequent target localization, with no direct feedback from the matching module to the backbone, especially its shallow layers. More concretely, only the matching module can directly access the target information (in the reference frame), while the representation learning of candidate frames is blind to the reference target. As a consequence, the accumulated effect of target-irrelevant interference in the shallow stages may degrade the feature quality of deeper layers. In this paper, we approach the problem from a different angle by conducting multiple branch-wise interactions inside a Siamese-like backbone network (InBN). At the core of InBN is a general interaction modeler (GIM) that injects prior knowledge of the reference image into different stages of the backbone network, leading to better target perception and more robust candidate feature representations, at negligible computational cost. The proposed GIM module and InBN mechanism are general and applicable to different backbone types, including CNNs and transformers, as demonstrated by our extensive experiments on multiple benchmarks. In particular, the CNN version (based on SiamCAR) improves SUC by absolute gains of 3.2/6.9 on LaSOT/TNL2K, respectively. The transformer version obtains SUC scores of 25.7/52.0 on LaSOT/TNL2K, on par with recent state-of-the-art results. Code and models will be released.
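One plausible (and simplified) realization of the stage-wise interaction is sketched below: reference (template) tokens are injected into the candidate (search-region) features of a backbone stage via cross-attention. This is an assumption about the mechanism for illustration, not the released InBN code.

```python
import torch
import torch.nn as nn

class GeneralInteractionModeler(nn.Module):
    """Inject reference-target priors into candidate features of one backbone stage."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, candidate_feat, reference_feat):
        # candidate_feat: (B, N, dim) search-region tokens at this backbone stage
        # reference_feat: (B, M, dim) reference-target tokens
        prior, _ = self.attn(candidate_feat, reference_feat, reference_feat)
        return self.norm(candidate_feat + prior)   # target-aware candidate features
```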
In this paper, a semantic communication framework for image transmission is developed. In the investigated framework, a set of servers cooperatively transmit images to a set of users utilizing semantic communication techniques. To evaluate the performance of the studied semantic communication system, a multimodal metric is proposed to measure the correlation between the extracted semantic information and the original image. To meet the ISS requirement of each user, each server must jointly determine the semantic information to be transmitted and the resource blocks (RBs) used for semantic information transmission. We formulate this problem as an optimization problem that aims to minimize each server's transmission latency while meeting the ISS requirement. To solve this problem, a value-decomposition-based entropy-maximized multi-agent reinforcement learning (RL) method is proposed, which enables servers to coordinate during training and execute RB allocation in a distributed manner, approaching globally optimal performance with fewer training iterations. Compared to traditional multi-agent RL, the proposed RL method improves the exploration of valuable actions by the servers and the probability of finding a globally optimal RB allocation policy based on local observations. Simulation results show that the proposed algorithm can reduce the transmission delay by up to 16.1% compared to traditional multi-agent RL.
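As a rough sketch of value-decomposition multi-agent RL with an entropy term, the code below sums per-server local Q-values into a joint value and adds an entropy bonus to encourage exploration; the observation/action encoding for RB allocation and the paper's exact algorithm are not reproduced here.

```python
import torch
import torch.nn as nn

class AgentQ(nn.Module):
    """Local Q-network of one server over its RB-allocation actions."""
    def __init__(self, obs_dim, num_actions):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, num_actions))

    def forward(self, obs):
        return self.net(obs)

def joint_td_loss(agents, obs, actions, reward, next_obs, gamma=0.99, alpha=0.01):
    """obs/next_obs: list of (B, obs_dim) local observations, one per agent;
    actions: (B, num_agents) long tensor of chosen actions; reward: (B, 1)."""
    q_joint, entropy, target = 0.0, 0.0, 0.0
    for i, agent in enumerate(agents):
        q = agent(obs[i])                                     # (B, A) local Q-values
        q_joint = q_joint + q.gather(1, actions[:, i:i + 1])  # value decomposition: sum of locals
        probs = torch.softmax(q, dim=-1)
        entropy = entropy - (probs * probs.log()).sum(-1, keepdim=True)
        with torch.no_grad():
            target = target + agent(next_obs[i]).max(dim=1, keepdim=True).values
    td_target = reward + gamma * target
    # entropy bonus pushes each server to keep exploring valuable actions
    return ((q_joint - td_target) ** 2).mean() - alpha * entropy.mean()
```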
This paper considers improving the wireless communication and computation efficiency of federated learning (FL) via model quantization. In the proposed bitwidth FL scheme, edge devices train and transmit quantized versions of their local FL model parameters to a coordinating server, which aggregates them into a quantized global model and synchronizes the devices. The goal is to jointly determine the bitwidths used for local FL model quantization and the set of devices participating in FL training at each iteration. This is cast as an optimization problem whose goal is to minimize the training loss of quantized FL under a per-iteration device sampling budget and delay requirement. To derive a solution, an analytical characterization is performed to show how the limited wireless resources and the induced quantization error affect the performance of the proposed FL method. The analytical results show that the improvement of the FL training loss between two consecutive iterations depends on the device selection and quantization scheme, as well as on several parameters inherent to the model being learned. Given linear-regression-based estimates of these model properties, it is shown that the FL training process can be described as a Markov decision process (MDP), and a model-based reinforcement learning (RL) method is then proposed to optimize action selection over the iterations. Compared to model-free RL, this model-based RL approach leverages the derived mathematical characterization of the FL training process to discover effective device selection and quantization schemes without imposing additional device communication overhead. Simulation results show that the proposed FL algorithm can reduce convergence time by 29% and 63% compared to a model-free RL method and the standard FL method, respectively.
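A minimal sketch of the quantize-then-aggregate step is shown below, using a generic uniform quantizer as a stand-in for the paper's scheme; the bitwidth b_k per device and the aggregation weights would come from the MDP-based policy described above.

```python
import torch

def quantize(tensor, bits):
    """Uniform quantization of a local model parameter tensor to a given bitwidth."""
    levels = 2 ** bits - 1
    lo, hi = tensor.min(), tensor.max()
    scale = (hi - lo).clamp(min=1e-12) / levels
    return torch.round((tensor - lo) / scale) * scale + lo

def aggregate(local_models, weights):
    """Server-side aggregation of the quantized local models into the global model."""
    return {name: sum(w * m[name] for w, m in zip(weights, local_models))
            for name in local_models[0]}

# A selected device k would upload a b_k-bit version of its trained model, e.g.:
# update_k = {n: quantize(p, bits=b_k) for n, p in model_k.state_dict().items()}
```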
Graph convolutional neural networks (GCNs) have attracted increasing attention and achieved good performance in various computer vision tasks; however, a clear explanation of the inner mechanism of GCNs is lacking. For standard convolutional neural networks (CNNs), class activation mapping (CAM) methods are commonly used to visualize the connection between a CNN's decision and the image regions by generating a heatmap. Nonetheless, when these CAMs are applied directly to GCNs, the resulting heatmaps usually appear semantically chaotic. In this paper, we propose a novel visualization method particularly suited to GCNs, vertex semantic class activation mapping (VS-CAM). VS-CAM includes two independent pipelines that produce a set of semantic probe maps and a semantic base map, respectively. The semantic probe maps are used to detect the semantic information in the semantic base map and aggregate it into a semantic-aware heatmap. Qualitative results show that VS-CAM can obtain heatmaps that match the objects more precisely than CNN-based CAMs. Quantitative evaluations further demonstrate the superiority of VS-CAM.
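The aggregation step is only described at a high level above; the snippet below is one plausible, heavily simplified reading of it, in which each semantic probe map is weighted by its agreement with the semantic base map and the weighted probes are combined into a vertex-level heatmap. The shapes and the weighting rule are guesses, not the paper's definition.

```python
import torch

def vs_cam_heatmap(probe_maps, base_map):
    """probe_maps: (K, N) responses of K semantic probes over N graph vertices;
    base_map: (N,) class-discriminative score per vertex.
    Returns an (N,) semantic-aware heatmap normalized to [0, 1]."""
    relevance = (probe_maps * base_map).sum(dim=1, keepdim=True)   # probe-vs-base agreement
    heatmap = (torch.softmax(relevance, dim=0) * probe_maps).sum(dim=0) * base_map
    return (heatmap - heatmap.min()) / (heatmap.max() - heatmap.min() + 1e-8)
```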
In this work, we consider a federated learning model in a wireless system with multiple base stations and inter-cell interference. During the learning phase, we apply a differentially private scheme to transmit information from the users to their corresponding base station. We show the convergence behavior of the learning process by deriving an upper bound on its optimality gap. Furthermore, we define an optimization problem to reduce this upper bound and the total privacy leakage. To find a locally optimal solution to this problem, we first propose an algorithm that schedules resource blocks and users. We then extend the scheme to reduce the total privacy leakage by optimizing the differential-privacy artificial noise. We apply the solutions of these two procedures to the parameters of the federated learning system. In this setting, we assume that each user is equipped with a classifier. Moreover, the communication cells are assumed to have fewer resource blocks than the number of users. Simulation results show that our proposed scheduler improves the average prediction accuracy compared to a random scheduler. Furthermore, its extended version with a noise optimizer significantly reduces the amount of privacy leakage.
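Two of the building blocks described above admit simple sketches, shown below under stated assumptions: a standard Gaussian-mechanism step for the differentially private uplink, and a toy scheduler for the case of fewer resource blocks than users. The paper's actual scheduler and noise optimizer instead work from the derived convergence bound and privacy-leakage terms.

```python
import torch

def privatize_update(update, clip_norm=1.0, noise_std=0.1):
    """Clip a local model update and add artificial Gaussian noise before
    transmission to the base station (generic DP sketch; the paper optimizes
    the noise level against its derived bound)."""
    flat = torch.cat([p.flatten() for p in update.values()])
    scale = torch.clamp(clip_norm / (flat.norm() + 1e-12), max=1.0)
    return {name: p * scale + noise_std * torch.randn_like(p)
            for name, p in update.items()}

def schedule_users(channel_gains, num_rbs):
    """Toy scheduler: with fewer resource blocks than users, keep the users with
    the strongest channels in this round."""
    order = sorted(range(len(channel_gains)), key=lambda i: -channel_gains[i])
    return order[:num_rbs]
```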